Pre-processing

  1. Adapters were trimmed using cutadapt v1.16.
  2. Gene expression was quantified using salmon v1.3.0.
  3. Gene-level TPMs were obtained using tximport v1.20.0.

```r
# Packages used throughout the analysis (load order preserved).
library(dplyr)
library(ggplot2)
library(DESeq2)
library(tximport)
library(readr)
library(tximportData)
library(readxl)
library(knitr)
library(tidyverse)
library(pheatmap)
library(RColorBrewer)
library(viridis)
library(ggrepel)
library(EnhancedVolcano)
library(fgsea)
library(limma)
```


Define the color schemes and plot them


[Figure: color swatches for the Plasma, Viridis, Cividis, and Magma palettes, colors 1:10 of each]
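
The palette swatches could be produced along the following lines. This is a minimal sketch, assuming the colors come from the `viridis` package loaded earlier; the names `n_colors` and `palettes` are illustrative and do not come from the original analysis.

```r
# Illustrative sketch: build four 10-color palettes and draw one swatch row each.
n_colors <- 10

if (requireNamespace("viridis", quietly = TRUE)) {
  palettes <- list(
    Plasma  = viridis::plasma(n_colors),
    Viridis = viridis::viridis(n_colors),
    Cividis = viridis::cividis(n_colors),
    Magma   = viridis::magma(n_colors)
  )

  # One panel per palette, stacked vertically.
  op <- par(mfrow = c(length(palettes), 1), mar = c(1, 1, 2, 1))
  for (name in names(palettes)) {
    image(seq_len(n_colors), 1, as.matrix(seq_len(n_colors)),
          col = palettes[[name]], axes = FALSE, xlab = "", ylab = "",
          main = paste(name, "1:10"))
  }
  par(op)
}
```

Each palette function returns a character vector of hex colors, so the same vectors can be passed on to heatmap or scatter plot color arguments.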

Generate a summary table of the samples sequenced and their sequencing and alignment metrics

Summary of Data Metrics
| Sample | Patient | Reads | % Q30 | Duplication rate (%) | Reads with adapter (%) | STAR aligned reads | STAR aligned (%) | Annotated splices | Salmon mapping rate (%) |
|--------|---------|----------|------|----|-----|----------|-------|----------|---------|
| Bulk   | HTB314 | 71816976 | 94.3 | 57 | 2.6 | 63614504 | 88.58 | 27340654 | 85.6967 |
| Bulk   | MDS268 | 31441538 | 93.0 | 24 | 2.3 | 25549835 | 81.26 | 5546270  | 58.2075 |
| Bulk   | MDS280 | 28794421 | 93.0 | 24 | 2.3 | 23317205 | 80.98 | 5115546  | 58.8614 |
| CD123+ | HTB314 | 61568500 | 94.0 | 56 | 2.7 | 55359277 | 89.91 | 22831187 | 86.4197 |
| CD123+ | MDS268 | 32307881 | 93.0 | 27 | 3.3 | 26243441 | 81.23 | 7290861  | 64.2150 |
| CD123+ | MDS280 | 23745205 | 93.0 | 34 | 2.6 | 21035358 | 88.59 | 5668653  | 71.3245 |
| CD123- | HTB314 | 83260626 | 94.0 | 58 | 2.6 | 74804722 | 89.84 | 31876605 | 87.6505 |
| CD123- | MDS268 | 27917999 | 93.0 | 26 | 2.7 | 22835189 | 81.79 | 5793322  | 63.9284 |
| CD123- | MDS280 | 24767573 | 93.0 | 32 | 2.6 | 22467937 | 90.72 | 4773823  | 62.7193 |
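
As a sanity check on the table, the percent-aligned column appears to be the STAR-aligned read count divided by the total read count. The sketch below recomputes it for the three Bulk rows; the data frame and column names are illustrative, and only the numbers are taken from the table.

```r
# Values copied from the Bulk rows of the summary table.
metrics <- data.frame(
  sample       = c("Bulk", "Bulk", "Bulk"),
  patient      = c("HTB314", "MDS268", "MDS280"),
  reads        = c(71816976, 31441538, 28794421),
  star_aligned = c(63614504, 25549835, 23317205),
  stringsAsFactors = FALSE
)

# Percent aligned = aligned reads / total reads, rounded to two decimals.
metrics$percent_aligned <- round(100 * metrics$star_aligned / metrics$reads, 2)

print(metrics)
```

The recomputed values agree with the reported 88.58, 81.26, and 80.98 for these samples.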

Import and Format Data for DESeq2
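
A minimal sketch of what this import step might look like, assuming salmon `quant.sf` files summarized to gene level with tximport. The helper `build_dds`, its arguments, the `patient` and `fraction` column names, and the design formula are all assumptions for illustration, not the original code.

```r
# Hypothetical helper: import salmon quantifications and build a DESeq2 dataset.
# `quant_files` (named vector of quant.sf paths), `tx2gene` (transcript-to-gene
# table), and `sample_info` (one row per sample with `patient` and `fraction`
# columns) are assumptions, not objects from the original analysis.
build_dds <- function(quant_files, tx2gene, sample_info) {
  stopifnot(all(file.exists(quant_files)),
            length(quant_files) == nrow(sample_info))

  # Summarize transcript-level salmon estimates to gene level.
  txi <- tximport::tximport(quant_files, type = "salmon", tx2gene = tx2gene)

  # Assumed design: sorted fraction (bulk, CD123+, CD123-) adjusted for patient.
  DESeq2::DESeqDataSetFromTximport(txi,
                                   colData = sample_info,
                                   design = ~ patient + fraction)
}
```

Including `patient` in the design is one way to account for the paired structure (each patient contributes all three fractions); whether the original analysis did so is not stated.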

Sample-to-sample correlation heatmap (Spearman method)
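
The correlation heatmap can be sketched as follows. The expression matrix here is simulated placeholder data, and the object names are illustrative; in the real analysis the input would presumably be a transformed count or TPM matrix.

```r
set.seed(1)

# Simulated log-scale expression: 200 genes x 9 samples (placeholder data).
expr <- matrix(rnorm(200 * 9), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("sample", 1:9)))

# Rank-based (Spearman) correlation between every pair of samples.
sample_cor <- cor(expr, method = "spearman")

# Draw the clustered heatmap if pheatmap is available.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(sample_cor, main = "Sample-to-sample Spearman correlation")
}
```

Spearman correlation works on ranks, so it is less sensitive than Pearson to a handful of very highly expressed genes.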

## PCA plot

PC1 vs PC2
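
A PC1 vs PC2 plot of this kind could be produced roughly as below. This sketch uses base R `prcomp` on simulated data rather than the DESeq2 plotting helper, and every object name is illustrative.

```r
set.seed(1)

# Simulated expression matrix: 500 genes x 9 samples (placeholder data).
expr <- matrix(rnorm(500 * 9), nrow = 500,
               dimnames = list(paste0("gene", 1:500), paste0("sample", 1:9)))

# prcomp expects samples in rows, so transpose the genes x samples matrix.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Percent of total variance explained by each component.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

plot(pca$x[, 1], pca$x[, 2], pch = 19,
     xlab = sprintf("PC1 (%.1f%% variance)", pct_var[1]),
     ylab = sprintf("PC2 (%.1f%% variance)", pct_var[2]),
     main = "PC1 vs PC2")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3, cex = 0.7)
```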

## 3. Run differential expression testing using DESeq2 and calculate gene set enrichment

## Compare 123pos vs 123neg, 123neg vs bulk, and 123pos vs bulk

#### sig = padj < 0.01 and abs(l2fc) > 0.5

## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] 1405
## [1] 2382
## [1] 2336
## [1] 1206
## [1] 2157
## [1] 2130
## [1] 939
## [1] 1112
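
The significance rule stated in this section (padj < 0.01 and absolute log2 fold change > 0.5) can be applied to a results table as in the sketch below. The toy table uses DESeq2's result column names, but the genes and values are made up for illustration.

```r
# Toy results table using DESeq2's column names (values are made up).
res <- data.frame(
  gene           = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  log2FoldChange = c(2.1, -0.3, -1.4, 0.8, 3.0),
  padj           = c(0.001, 0.0001, 0.005, 0.2, NA),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.01
lfc_cutoff  <- 0.5

# padj can be NA (for example after independent filtering);
# treat those genes as not significant.
is_sig <- !is.na(res$padj) &
  res$padj < padj_cutoff &
  abs(res$log2FoldChange) > lfc_cutoff

sig_genes <- res$gene[is_sig]
length(sig_genes)
```

Here only `geneA` and `geneC` pass both thresholds; `geneB` fails on effect size, `geneD` on adjusted p-value, and `geneE` has no adjusted p-value.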

Volcano Plot
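
A volcano plot places log2 fold change on the x-axis and -log10 adjusted p-value on the y-axis. The sketch below uses base R graphics on simulated data as a stand-in for EnhancedVolcano; the cutoffs match the significance rule stated for this analysis, and all object names are illustrative.

```r
set.seed(1)
n <- 1000

# Simulated results (placeholder values, not real DESeq2 output).
res <- data.frame(
  log2FoldChange = rnorm(n, sd = 1.5),
  padj           = runif(n)^3
)

# Flag genes passing both cutoffs.
status <- ifelse(res$padj < 0.01 & abs(res$log2FoldChange) > 0.5,
                 "significant", "not significant")

plot(res$log2FoldChange, -log10(res$padj), pch = 20,
     col = ifelse(status == "significant", "red", "grey60"),
     xlab = "log2 fold change", ylab = "-log10 adjusted p-value",
     main = "Volcano plot (simulated data)")

# Dashed guides at the significance and fold-change cutoffs.
abline(h = -log10(0.01), v = c(-0.5, 0.5), lty = 2)
```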

### MA plots
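
An MA plot shows log2 fold change against mean expression, which helps reveal whether significant calls concentrate at low or high expression. This is a base R sketch on simulated data, not the DESeq2 plotting method; the column names follow DESeq2's results format and the values are placeholders.

```r
set.seed(1)
n <- 1000

# Simulated results (placeholder values, not real DESeq2 output).
res <- data.frame(
  baseMean       = rexp(n, rate = 1 / 200) + 1,
  log2FoldChange = rnorm(n),
  padj           = runif(n)^3
)

is_sig <- res$padj < 0.01 & abs(res$log2FoldChange) > 0.5

plot(log10(res$baseMean), res$log2FoldChange, pch = 20,
     col = ifelse(is_sig, "red", "grey60"),
     xlab = "log10 mean of normalized counts", ylab = "log2 fold change",
     main = "MA plot (simulated data)")

# Reference line at no change.
abline(h = 0, lty = 2)
```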

PC1 vs PC2 using a variety of parameters to understand the variation

GSEA analysis
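
A minimal sketch of a preranked GSEA run with `fgsea`, assuming genes are ranked by a per-gene statistic. The ranks and gene sets below are simulated and hypothetical; with DESeq2 output, the Wald `stat` column is one common choice of ranking statistic, though the original ranking is not stated.

```r
set.seed(1)
genes <- paste0("gene", 1:500)

# Named vector of ranking statistics, sorted from most up to most down.
ranks <- sort(setNames(rnorm(500), genes), decreasing = TRUE)

# Hypothetical gene sets; real ones would come from a pathway database.
pathways <- list(
  setA = genes[1:40],
  setB = sample(genes, 30)
)

if (requireNamespace("fgsea", quietly = TRUE)) {
  gsea_res <- as.data.frame(
    fgsea::fgsea(pathways = pathways, stats = ranks,
                 minSize = 15, maxSize = 500)
  )
  # Show pathways ordered by adjusted p-value.
  print(gsea_res[order(gsea_res$padj), c("pathway", "NES", "padj")])
}
```

The normalized enrichment score (NES) is positive when a gene set is concentrated at the top of the ranking and negative when it is concentrated at the bottom.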